智能论文笔记

Enhanced Frame and Event-Based Simulator and Event-Based Video Interpolation Network

Adam Radomski , Andreas Georgiou , Thomas Debrunner , Chenghan Li , Luca Longinotti , Minwon Seo , Moosung Kwak , Chang-Woo Shin , Paul K. J. Park , Hyunsurk Eric Ryu

分类：计算机视觉

2021-12-17

基于快速的神经形态的视觉传感器（动态视觉传感器，DVS）可以与基于较慢的帧的传感器组合，以实现比使用例如固定运动近似的传统方法更高质量的帧间内插。光流。在这项工作中，我们展示了一个新的高级事件模拟器，可以产生由相机钻机录制的现实场景，该仪器具有位于固定偏移的任意数量的传感器。它包括具有现实图像质量降低效果的新型可配置帧的图像传感器模型，以及具有更精确的特性的扩展DVS模型。我们使用我们的模拟器培训一个新的重建模型，专为高FPS视频的端到端重建而设计。与以前发表的方法不同，我们的方法不需要帧和DVS相机具有相同的光学，位置或相机分辨率。它还不限于物体与传感器的固定距离。我们表明我们的模拟器生成的数据可用于训练我们的新模型，导致在与最先进的公共数据集上的公共数据集中的重建图像。我们还向传感器展示了真实传感器记录的数据。

translated by 谷歌翻译

2-hop Neighbor Class Similarity (2NCS): A graph structural metric indicative of graph neural network performance

Andrea Cavallo , Claas Grohnfeldt , Michele Russo , Giulio Lovisotto , Luca Vassio

分类：机器学习

2022-12-26

Graph Neural Networks (GNNs) achieve state-of-the-art performance on graph-structured data across numerous domains. Their underlying ability to represent nodes as summaries of their vicinities has proven effective for homophilous graphs in particular, in which same-type nodes tend to connect. On heterophilous graphs, in which different-type nodes are likely connected, GNNs perform less consistently, as neighborhood information might be less representative or even misleading. On the other hand, GNN performance is not inferior on all heterophilous graphs, and there is a lack of understanding of what other graph properties affect GNN performance. In this work, we highlight the limitations of the widely used homophily ratio and the recent Cross-Class Neighborhood Similarity (CCNS) metric in estimating GNN performance. To overcome these limitations, we introduce 2-hop Neighbor Class Similarity (2NCS), a new quantitative graph structural property that correlates with GNN performance more strongly and consistently than alternative metrics. 2NCS considers two-hop neighborhoods as a theoretically derived consequence of the two-step label propagation process governing GCN's training-inference process. Experiments on one synthetic and eight real-world graph datasets confirm consistent improvements over existing metrics in estimating the accuracy of GCN- and GAT-based architectures on the node classification task.

translated by 谷歌翻译

hxtorch.snn: Machine-learning-inspired Spiking Neural Network Modeling on BrainScaleS-2

Philipp Spilger , Elias Arnold , Luca Blessing , Christian Mauch , Christian Pehle , Eric Müller , Johannes Schemmel

分类：神经与进化计算

2022-12-23

Neuromorphic systems require user-friendly software to support the design and optimization of experiments. In this work, we address this need by presenting our development of a machine learning-based modeling framework for the BrainScaleS-2 neuromorphic system. This work represents an improvement over previous efforts, which either focused on the matrix-multiplication mode of BrainScaleS-2 or lacked full automation. Our framework, called hxtorch.snn, enables the hardware-in-the-loop training of spiking neural networks within PyTorch, including support for auto differentiation in a fully-automated hardware experiment workflow. In addition, hxtorch.snn facilitates seamless transitions between emulating on hardware and simulating in software. We demonstrate the capabilities of hxtorch.snn on a classification task using the Yin-Yang dataset employing a gradient-based approach with surrogate gradients and densely sampled membrane observations from the BrainScaleS-2 hardware system.

translated by 谷歌翻译

Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation

Gyan Tatiya , Jonathan Francis , Luca Bondi , Ingrid Navarro , Eric Nyberg , Jivko Sinapov , Jean Oh

分类：机器人 | 人工智能 | 计算机视觉

2022-12-21

Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding objects, resorting instead to evaluating agents on unheard sound clips of known objects; meanwhile, previous SAVi methods do not include explicit mechanisms for incorporating domain knowledge about object and region semantics. These weaknesses limit the development and assessment of models' abilities to generalise their learned experience. In this work, we introduce the use of knowledge-driven scene priors in the semantic audio-visual embodied navigation task: we combine semantic information from our novel knowledge graph that encodes object-region relations, spatial knowledge from dual Graph Encoder Networks, and background knowledge from a series of pre-training tasks -- all within a reinforcement learning framework for audio-visual navigation. We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects. We show improvements over strong baselines in generalisation to unseen regions and novel sounding objects, within the Habitat-Matterport3D simulation environment, under the SoundSpaces task.

translated by 谷歌翻译

Exploring the Challenges of Open Domain Multi-Document Summarization

John Giorgi , Luca Soldaini , Bo Wang , Gary Bader , Kyle Lo , Lucy Lu Wang , Arman Cohan

分类：自然语言处理 | 人工智能

2022-12-20

Multi-document summarization (MDS) has traditionally been studied assuming a set of ground-truth topic-related input documents is provided. In practice, the input document set is unlikely to be available a priori and would need to be retrieved based on an information need, a setting we call open-domain MDS. We experiment with current state-of-the-art retrieval and summarization models on several popular MDS datasets extended to the open-domain setting. We find that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity retrieval errors. To further probe these findings, we conduct perturbation experiments on summarizer inputs to study the impact of different types of document retrieval errors. Based on our results, we provide practical guidelines to help facilitate a shift to open-domain MDS. We release our code and experimental results alongside all data or model artifacts created during our investigation.

translated by 谷歌翻译

Taming Lagrangian Chaos with Multi-Objective Reinforcement Learning

Chiara Calascibetta , Luca Biferale , Francesco Borra , Antonio Celani , Massimo Cencini

分类：机器学习

2022-12-19

We consider the problem of two active particles in 2D complex flows with the multi-objective goals of minimizing both the dispersion rate and the energy consumption of the pair. We approach the problem by means of Multi Objective Reinforcement Learning (MORL), combining scalarization techniques together with a Q-learning algorithm, for Lagrangian drifters that have variable swimming velocity. We show that MORL is able to find a set of trade-off solutions forming an optimal Pareto frontier. As a benchmark, we show that a set of heuristic strategies are dominated by the MORL solutions. We consider the situation in which the agents cannot update their control variables continuously, but only after a discrete (decision) time, $\tau$. We show that there is a range of decision times, in between the Lyapunov time and the continuous updating limit, where Reinforcement Learning finds strategies that significantly improve over heuristics. In particular, we discuss how large decision times require enhanced knowledge of the flow, whereas for smaller $\tau$ all a priori heuristic strategies become Pareto optimal.

translated by 谷歌翻译

Robust Explanation Constraints for Neural Networks

Matthew Wicker , Juyeon Heo , Luca Costabello , Adrian Weller

分类：机器学习

2022-12-16

Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs. However, popular explanations methods have been found to be fragile to minor perturbations of input features or model parameters. Relying on constraint relaxation techniques from non-convex optimization, we develop a method that upper-bounds the largest change an adversary can make to a gradient-based explanation via bounded manipulation of either the input features or model parameters. By propagating a compact input or parameter set as symbolic intervals through the forwards and backwards computations of the neural network we can formally certify the robustness of gradient-based explanations. Our bounds are differentiable, hence we can incorporate provable explanation robustness into neural network training. Empirically, our method surpasses the robustness provided by previous heuristic approaches. We find that our training method is the only method able to learn neural networks with certificates of explanation robustness across all six datasets tested.

translated by 谷歌翻译

Objaverse: A Universe of Annotated 3D Objects

Matt Deitke , Dustin Schwenk , Jordi Salvador , Luca Weihs , Oscar Michel , Eli VanderBilt , Ludwig Schmidt , Kiana Ehsani , Aniruddha Kembhavi , Ali Farhadi

分类：计算机视觉 | 人工智能 | 机器人

2022-12-15

Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have propelled recent dramatic progress in AI. Large neural models trained on such datasets produce impressive results and top many of today's benchmarks. A notable omission within this family of large-scale datasets is 3D data. Despite considerable interest and potential applications in 3D vision, datasets of high-fidelity 3D models continue to be mid-sized with limited diversity of object categories. Addressing this gap, we present Objaverse 1.0, a large dataset of objects with 800K+ (and growing) 3D models with descriptive captions, tags, and animations. Objaverse improves upon present day 3D repositories in terms of scale, number of categories, and in the visual diversity of instances within a category. We demonstrate the large potential of Objaverse via four diverse applications: training generative 3D models, improving tail category segmentation on the LVIS benchmark, training open-vocabulary object-navigation models for Embodied AI, and creating a new benchmark for robustness analysis of vision models. Objaverse can open new directions for research and enable new applications across the field of AI.

translated by 谷歌翻译

A Survey on Reinforcement Learning Security with Application to Autonomous Driving

Ambra Demontis , Maura Pintor , Luca Demetrio , Kathrin Grosse , Hsiao-Ying Lin , Chengfang Fang , Battista Biggio , Fabio Roli

分类：机器学习 | 机器人

2022-12-12

Reinforcement learning allows machines to learn from their own experience. Nowadays, it is used in safety-critical applications, such as autonomous driving, despite being vulnerable to attacks carefully crafted to either prevent that the reinforcement learning algorithm learns an effective and reliable policy, or to induce the trained agent to make a wrong decision. The literature about the security of reinforcement learning is rapidly growing, and some surveys have been proposed to shed light on this field. However, their categorizations are insufficient for choosing an appropriate defense given the kind of system at hand. In our survey, we do not only overcome this limitation by considering a different perspective, but we also discuss the applicability of state-of-the-art attacks and defenses when reinforcement learning algorithms are used in the context of autonomous driving.

translated by 谷歌翻译

Prompting Is Programming: A Query Language For Large Language Models

Luca Beurer-Kellner , Marc Fischer , Martin Vechev

分类：自然语言处理 | 人工智能

2022-12-12

Large language models have demonstrated outstanding performance on a wide range of tasks such as question answering and code generation. On a high level, given an input, a language model can be used to automatically complete the sequence in a statistically-likely way. Based on this, users prompt these models with language instructions or examples, to implement a variety of downstream tasks. Advanced prompting methods can even imply interaction between the language model, a user, and external tools such as calculators. However, to obtain state-of-the-art performance or adapt language models for specific tasks, complex task- and model-specific programs have to be implemented, which may still require ad-hoc interaction. Based on this, we present the novel idea of Language Model Programming (LMP). LMP generalizes language model prompting from pure text prompts to an intuitive combination of text prompting and scripting. Additionally, LMP allows constraints to be specified over the language model output. This enables easy adaption to many tasks, while abstracting language model internals and providing high-level semantics. To enable LMP, we implement LMQL (short for Language Model Query Language), which leverages the constraints and control flow from an LMP prompt to generate an efficient inference procedure that minimizes the number of expensive calls to the underlying language model. We show that LMQL can capture a wide range of state-of-the-art prompting methods in an intuitive way, especially facilitating interactive flows that are challenging to implement with existing high-level APIs. Our evaluation shows that we retain or increase the accuracy on several downstream tasks, while also significantly reducing the required amount of computation or cost in the case of pay-to-use APIs (13-85% cost savings).

translated by 谷歌翻译